10 research outputs found

    leave a trace - A People Tracking System Meets Anomaly Detection

    Full text link
    Video surveillance has always had a negative connotation, among other reasons because of the loss of privacy and because it does not automatically increase public safety. If it were able to detect atypical (i.e. dangerous) situations in real time, autonomously and anonymously, this could change. A prerequisite is a reliable automatic detection of potentially dangerous situations from video data. Classically, this is done by object extraction and tracking; from the derived trajectories, we then want to identify dangerous situations by detecting atypical trajectories. However, for ethical reasons it is better to develop such a system on data in which no people are threatened or harmed, and in which they know that such a tracking system is installed. Another important point is that these situations do not occur very often in real, public CCTV areas and may be captured properly even less often. In the artistic project leave a trace, the tracked objects, people in the atrium of an institutional building, become actors and thus part of the installation. Real-time visualisation allows interaction by these actors, which in turn creates many atypical interaction situations on which we can develop our situation detection. The data set has evolved over three years and is hence very large. In this article we describe the tracking system and several approaches for the detection of atypical trajectories.
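    The abstract does not specify which detection methods were used; as a minimal sketch of the general idea, a candidate trajectory can be scored by its distance to the nearest trajectory in a set of typical ones (a simple nearest-neighbour baseline, not the authors' method; all names and data here are illustrative):

    ```python
    import numpy as np

    def traj_distance(a, b):
        """Mean point-wise Euclidean distance between two equal-length trajectories."""
        return float(np.mean(np.linalg.norm(a - b, axis=1)))

    def anomaly_score(candidate, typical_set):
        """Distance to the nearest typical trajectory; large values flag atypical motion."""
        return min(traj_distance(candidate, t) for t in typical_set)

    # Toy data: straight walks across the atrium are "typical", a zig-zag is not.
    s = np.linspace(0.0, 1.0, 20)
    typical = [np.stack([s, np.full_like(s, y)], axis=1) for y in (0.0, 0.5, 1.0)]
    zigzag = np.stack([s, 0.5 + 0.4 * np.sin(20 * s)], axis=1)
    straight = np.stack([s, np.full_like(s, 0.45)], axis=1)

    score_zigzag = anomaly_score(zigzag, typical)
    score_straight = anomaly_score(straight, typical)
    ```

    Real systems would normalise for trajectory length and speed; this sketch only illustrates the nearest-neighbour scoring idea.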

    3D real time object recognition

    Get PDF
    Object recognition is a natural process of the human brain performed in the visual cortex; it relies on a binocular depth perception system that renders a three-dimensional representation of the objects in a scene. Hitherto, computer and software systems have been used to simulate the perception of three-dimensional environments with the aid of sensors that capture real-time images. In the process, such images are used as input data for further analysis and for the development of algorithms, an essential ingredient for simulating the complexity of human vision, so as to achieve scene interpretation for object recognition similar to the way the human brain perceives it. The rapid pace of technological advancement in hardware and software is continuously bringing machine-based object recognition closer to the human vision prototype. The key in this field is the development of algorithms that achieve robust scene interpretation. Considerable and significant effort has been successfully carried out over the years in 2D object recognition, as opposed to 3D. It is therefore within the context and scope of this dissertation to contribute towards the enhancement of 3D object recognition: a better interpretation and understanding of visible reality as well as of the relationship between objects in a scene. Through the use of low-cost commodity sensors, such as the Microsoft Kinect, RGB and depth data of a scene have been retrieved and manipulated in order to generate human-like visual perception data. The goal herein is to show how RGB and depth information can be utilised to develop a new class of 3D object recognition algorithms, analogous to the perception processed by the human brain.

    A Quality Evaluation of Single and Multiple Camera Calibration Approaches for an Indoor Multi Camera Tracking System

    No full text
    Human detection and tracking has been a prominent research area for scientists around the globe. State-of-the-art algorithms have been implemented, refined and accelerated to significantly improve the detection rate and eliminate false positives. While 2D approaches are well investigated, 3D human detection and tracking is still a largely unexplored research field. In both the 2D and 3D cases, introducing a multi-camera system can vastly improve the accuracy and confidence of the tracking process. In this work, a quality evaluation is performed on a multi-RGB-D-camera indoor tracking system to examine how camera calibration and pose affect the quality of human tracks in the scene, independently of the detection and tracking approach used. After performing a calibration step on every Kinect sensor, state-of-the-art single-camera pose estimators were evaluated to check how well poses are estimated using planar objects such as an ordinary chessboard. With this information, a bundle block adjustment and ICP were performed to verify the accuracy of the single pose estimators in a multi-camera configuration. Results show that single-camera estimators provide high-accuracy results of less than half a pixel, forcing the bundle to converge after very few iterations. With respect to ICP, relative information between cloud pairs is largely preserved, giving a low fitting score between concatenated pairs. Finally, sensor calibration proved to be an essential step for achieving maximum accuracy in the generated point clouds, and therefore in the accuracy of the 3D trajectories produced from each sensor.
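    As an illustration of the sub-half-pixel accuracy figure, the standard quality metric for such pose estimates is the mean reprojection error of known 3D points. The following numpy sketch uses a made-up pinhole camera and chessboard geometry, not the paper's evaluation pipeline:

    ```python
    import numpy as np

    def project(K, R, t, X):
        """Project 3D world points X (N, 3) into pixels via intrinsics K and pose (R, t)."""
        Xc = X @ R.T + t            # world -> camera coordinates
        x = Xc @ K.T                # camera -> homogeneous image coordinates
        return x[:, :2] / x[:, 2:3]

    def mean_reprojection_error(K, R, t, X, observed):
        """Mean pixel distance between projected and observed 2D points."""
        return float(np.mean(np.linalg.norm(project(K, R, t, X) - observed, axis=1)))

    # A flat "chessboard" of 6x8 corners with 30 mm spacing, 2 m in front of the camera.
    K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.array([0.0, 0.0, 2.0])
    gx, gy = np.meshgrid(np.arange(8) * 0.03, np.arange(6) * 0.03)
    X = np.stack([gx.ravel(), gy.ravel(), np.zeros(48)], axis=1)

    observed = project(K, R, t, X) + np.array([0.3, 0.0])   # simulate a 0.3 px bias
    err = mean_reprojection_error(K, R, t, X, observed)
    ```

    A mean error below 0.5 px, as reported, indicates the pose explains the observed corners to sub-pixel accuracy.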

    Calibration of a multiple stereo and RGB-D camera system for 3D human tracking

    No full text
    Human tracking in computer vision is a very active and growing research area. Previous works address this topic by applying algorithms and feature extraction in 2D, while 3D tracking is a largely unexplored field, especially with regard to multi-camera systems. The approach discussed in this paper focuses on the detection and tracking of human postures using multiple RGB-D sensors together with stereo cameras. We use low-cost devices, such as the Microsoft Kinect and a people counter based on a stereo system. The novelty of our technique lies in the synchronization of multiple devices and the determination of their exterior and relative orientation in space with respect to a common world coordinate system. Bundle adjustment is then applied to obtain a unique 3D scene, which serves as the starting point for detecting and tracking humans and for extracting significant metrics from the acquired datasets. In this article, the approaches for determining the exterior and absolute orientation are described. Subsequently, it is shown how a common point cloud is formed. Finally, some results for object detection and tracking based on 3D point clouds are presented.
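    Determining the exterior orientation of each sensor with respect to a common world frame amounts to estimating a rigid transform from corresponding points. A minimal sketch using the Kabsch/SVD solution (illustrative, not the paper's implementation):

    ```python
    import numpy as np

    def rigid_align(P, Q):
        """Least-squares R, t such that R @ P_i + t ~= Q_i (Kabsch algorithm)."""
        cP, cQ = P.mean(axis=0), Q.mean(axis=0)
        H = (P - cP).T @ (Q - cQ)                 # cross-covariance of centred points
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # guard against reflections
        t = cQ - R @ cP
        return R, t

    # Recover a known 30-degree rotation about z plus a translation.
    rng = np.random.default_rng(0)
    P = rng.normal(size=(10, 3))
    a = np.radians(30.0)
    R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a),  np.cos(a), 0.0],
                       [0.0,        0.0,       1.0]])
    t_true = np.array([0.5, -1.0, 2.0])
    Q = P @ R_true.T + t_true

    R_est, t_est = rigid_align(P, Q)
    ```

    In practice the correspondences come from calibration targets or ICP matches, and the per-pair estimates are refined jointly in the bundle adjustment the abstract mentions.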

    AlphaGAN: Generative adversarial networks for natural image matting

    No full text
    We present the first generative adversarial network (GAN) for natural image matting. Our novel generator network is trained to predict visually appealing alphas with the addition of an adversarial loss from a discriminator trained to classify well-composited images. Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherent in convolutional neural networks (CNNs) by using dilated convolutions to capture global context information without downscaling feature maps and losing spatial information. We present state-of-the-art results on the alpha matting online benchmark for the gradient error and comparable results for the other metrics. Our method is particularly well suited to fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production.
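    The dilated-convolution idea the abstract leans on can be illustrated in one dimension: spacing the filter taps apart widens the receptive field without any downscaling (a toy numpy sketch, unrelated to the paper's actual network):

    ```python
    import numpy as np

    def dilated_conv1d(x, w, dilation):
        """'Valid' 1D convolution whose taps are spaced `dilation` samples apart."""
        k = len(w)
        span = (k - 1) * dilation + 1           # receptive field of a single output
        out = np.array([sum(w[j] * x[i + j * dilation] for j in range(k))
                        for i in range(len(x) - span + 1)])
        return out, span

    x = np.arange(32, dtype=float)
    w = np.array([1.0, 1.0, 1.0])

    out1, span1 = dilated_conv1d(x, w, dilation=1)   # ordinary 3-tap filter
    out4, span4 = dilated_conv1d(x, w, dilation=4)   # same 3 weights, 9-sample context
    ```

    Stacking such layers with growing dilation rates multiplies the context each output sees while the feature map keeps its full resolution, which is exactly what matting fine structures requires.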

    Human Recognition in RGBD Combining Object Detectors and Conditional Random Fields

    No full text
    This paper addresses the problem of detecting and segmenting human instances in a point cloud. Both fields have been well studied during the last decades, showing impressive results not only in accuracy but also in computational performance. With the rapid adoption of depth sensors, the need to improve existing state-of-the-art algorithms by integrating depth information as an additional constraint has become more apparent. Current challenges involve combining RGB and depth information to reason about the location and spatial extent of the object of interest. We make use of an improved deformable part model algorithm, which allows the individual parts to deform across multiple scales to approximate the location of the person in the scene, and a conditional random field energy function to specify the object's spatial extent. Our proposed energy function models up to pairwise relations defined in the RGBD domain, enforcing label consistency for regions sharing similar unary and pairwise measurements. Experimental results show that our proposed energy function provides a fairly precise segmentation even when the resulting detection box is imprecise. Reasoning about the detection algorithm could potentially enhance the quality of the detection box, allowing the object of interest to be captured as a whole.
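    An energy "up to pairwise relations" can be sketched on a 4-connected pixel grid: unary label costs plus a Potts penalty that is strongest where neighbouring measurements are similar. This is a generic illustrative formulation in numpy, not the paper's exact energy:

    ```python
    import numpy as np

    def crf_energy(labels, unary, sim, beta):
        """Unary cost plus a similarity-weighted Potts pairwise term on a 4-connected grid.

        labels: (H, W) integer label map; unary: (H, W, L) per-pixel label costs;
        sim: (H, W) feature map; neighbours with similar `sim` values pay up to
        `beta` when their labels disagree, enforcing label consistency for
        regions sharing similar measurements.
        """
        H, W = labels.shape
        e = float(unary[np.arange(H)[:, None], np.arange(W)[None, :], labels].sum())
        for dy, dx in ((0, 1), (1, 0)):                     # each grid edge counted once
            a, b = labels[:H - dy, :W - dx], labels[dy:, dx:]
            w = np.exp(-np.abs(sim[:H - dy, :W - dx] - sim[dy:, dx:]))
            e += float((beta * w * (a != b)).sum())
        return e

    # On a flat image, a uniform labelling has lower energy than a checkerboard.
    H = W = 4
    unary = np.zeros((H, W, 2))
    sim = np.ones((H, W))
    uniform = np.zeros((H, W), dtype=int)
    checker = np.indices((H, W)).sum(axis=0) % 2

    e_uniform = crf_energy(uniform, unary, sim, beta=1.0)
    e_checker = crf_energy(checker, unary, sim, beta=1.0)
    ```

    Segmentation then means minimising such an energy over label maps, typically with graph cuts or mean-field inference rather than exhaustive search.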

    Towards a 3D Pipeline for Monitoring and Tracking People in an Indoor Scenario using multiple RGBD Sensors

    No full text
    This paper addresses the problem of detecting and segmenting human instances in a point cloud. Both fields have been well studied during the last decades, showing impressive results not only in accuracy but also in computational performance. With the rapid adoption of depth sensors, the need to improve existing state-of-the-art algorithms by integrating depth information as an additional constraint has become more apparent. Current challenges involve combining RGB and depth information to reason about the location and spatial extent of the object of interest. We make use of an improved deformable part model algorithm, which allows the individual parts to deform across multiple scales to approximate the location of the person in the scene, and a conditional random field energy function to specify the object's spatial extent. Our proposed energy function models up to pairwise relations defined in the RGBD domain, enforcing label consistency for regions sharing similar unary and pairwise measurements. Experimental results show that our proposed energy function provides a fairly precise segmentation even when the resulting detection box is imprecise. Reasoning about the detection algorithm could potentially enhance the quality of the detection box, allowing the object of interest to be captured as a whole.

    Development of a hardware-reconfiguration container on an FPGA-based processor of RISC-V architecture

    No full text
    Summarization: With hardware designs becoming more and more complicated in order to serve current demands, many hardware teams use reconfigurable hardware to test their designs before putting them on the market, as a cost-effective practice. These demands also cause hardware teams to be divided into sections to fulfil the requirements of a design, and consequently cloud services are becoming ever more necessary. With these arguments in mind, this thesis provides a RISC-V deployment platform that lets developers, or teams thereof, upload and test their custom hardware based on a complete RISC-V processor.

    SIGGRAPH 2018

    No full text
    This poster describes a reinterpretation of Samuel Beckett's theatrical text Play for virtual reality (VR). It is an aesthetic reflection on practice that follows up on a technical project description submitted to ISMAR 2017 [O'Dwyer et al. 2017]. Actors are captured in a green-screen environment using free-viewpoint video (FVV) techniques, and the scene is built in a game engine, complete with binaural spatial audio and six degrees of freedom of movement. The project explores how ludic qualities in the original text help elicit the conversational and interactive specificities of the digital medium. The work affirms the potential for interactive narrative in VR, opens new experiences of the text, and highlights the reorganisation of the author-audience dynamic.